The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice or about the bottlenecks the community faces in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the prospect of prize money played only a minor role (16%). Although a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for it, and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based either on multiple identical models (61%) or on heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
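The patch-based strategy mentioned above can be illustrated with a minimal sketch. It is not taken from the survey or from any challenge entry: the function names `patch_starts` and `extract_patches`, the list-of-lists image representation, and the choice to add a final border-aligned patch are all assumptions made for illustration.

```python
def patch_starts(length, patch, stride):
    """Start offsets of patches of size `patch` covering an axis of size `length`."""
    if patch >= length:
        return [0]  # a single (possibly smaller) patch covers the whole axis
    starts = list(range(0, length - patch + 1, stride))
    # Add a final patch flush with the border if the stride left a gap.
    if starts[-1] != length - patch:
        starts.append(length - patch)
    return starts


def extract_patches(image, patch, stride):
    """Yield (row, col, patch) tuples from a 2D list-of-lists image (hypothetical layout)."""
    rows, cols = len(image), len(image[0])
    for r in patch_starts(rows, patch, stride):
        for c in patch_starts(cols, patch, stride):
            yield r, c, [row[c:c + patch] for row in image[r:r + patch]]
```

Each patch can then be fed to a model on its own, which bounds memory use by the patch size rather than the image size; the last patch along each axis overlaps its neighbour whenever the stride does not divide the image evenly.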
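Global correlations are widely seen in human anatomical structures because of the similarity across tissues and bones. These correlations are reflected in magnetic resonance imaging (MRI) scans as a result of close-range proton density and T1/T2 parameters. Furthermore, to achieve accelerated MRI, k-space data are undersampled, which causes global aliasing artifacts. Convolutional neural network (CNN) models are widely used for accelerated MRI reconstruction, but they are limited in capturing global correlations because of the intrinsic locality of the convolution operation. Self-attention-based transformer models can capture global correlations among image features; however, the contributions of transformer models to MRI reconstruction have so far been minute. Existing contributions mostly provide hybrid CNN-transformer solutions and rarely leverage the physics of MRI. In this paper, we propose a physics-based stand-alone (convolution-free) transformer model, the Multi-head Cascaded Swin Transformers (MCSTRA), for accelerated MRI reconstruction. MCSTRA combines several interconnected MRI physics-related concepts with transformer networks: it exploits global MR features via the shifted-window self-attention mechanism; it extracts MR features belonging to different spectral components separately using a multi-head setup; it iterates between intermediate de-aliasing and k-space correction via a cascaded network with data consistency in k-space and intermediate loss computations; furthermore, we propose a novel positional embedding generation mechanism that guides self-attention using the point spread function corresponding to the undersampling mask. Our model significantly outperforms state-of-the-art MRI reconstruction methods both visually and quantitatively, while showing improved resolution and removal of aliasing artifacts.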
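The abstract does not state how data consistency in k-space is enforced. A common form is hard replacement: wherever a k-space location was actually sampled, the measured value overrides the network's prediction. The sketch below assumes that form. The function name `data_consistency`, the flat list of complex samples, and the binary mask are illustrative assumptions, and the Fourier transforms between image space and k-space are left out.

```python
def data_consistency(k_pred, k_meas, mask):
    """Keep measured k-space samples where mask is 1, predicted samples elsewhere.

    k_pred: predicted k-space values (complex), flattened
    k_meas: measured k-space values (complex), same length
    mask:   1 where a location was sampled, 0 where it was skipped
    """
    if not (len(k_pred) == len(k_meas) == len(mask)):
        raise ValueError("all inputs must have the same length")
    return [m if s else p for p, m, s in zip(k_pred, k_meas, mask)]
```

In a cascaded design, a step of this kind would typically sit between successive de-aliasing stages, so that each stage starts from an estimate that agrees with the acquired samples.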
Human activity recognition (HAR) using drone-mounted cameras has attracted considerable interest from the computer vision research community in recent years. A robust and efficient HAR system has a pivotal role in fields like video surveillance, crowd behavior analysis, sports analysis, and human-computer interaction. What makes it challenging are the complex poses, the need to handle different viewpoints, and the environmental scenarios in which the action takes place. To address such complexities, in this paper, we propose a novel Sparse Weighted Temporal Attention (SWTA) module that uses sparsely sampled video frames to obtain global weighted temporal attention. The proposed SWTA comprises two parts: first, a temporal segment network that sparsely samples a given set of frames; second, weighted temporal attention, which fuses attention maps derived from optical flow with raw RGB images. This is followed by a base network, which comprises a convolutional neural network (CNN) module along with fully connected layers that perform the activity recognition. The SWTA network can be used as a plug-in module for existing deep CNN architectures, enabling them to learn temporal information without a separate temporal stream. It has been evaluated on three publicly available benchmark datasets, namely Okutama, MOD20, and Drone-Action. The proposed model achieved an accuracy of 72.76%, 92.56%, and 78.86% on the respective datasets, thereby surpassing the previous state-of-the-art performances by a margin of 25.26%, 18.56%, and 2.94%, respectively.
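The sparse sampling step can be sketched in the style commonly used with temporal segment networks: the video is split into equal-length segments and one frame index is drawn from each. This is an illustrative sketch only. The function name `sample_sparse_frames`, its parameters, and the uniform random draw within each segment are assumptions, not details confirmed by the abstract.

```python
import random


def sample_sparse_frames(num_frames, num_segments, rng=None):
    """Pick one frame index from each of `num_segments` equal-length segments."""
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random()
    indices = []
    for s in range(num_segments):
        start = s * num_frames // num_segments        # inclusive
        end = (s + 1) * num_frames // num_segments    # exclusive
        indices.append(rng.randrange(start, end))
    return indices
```

Because every segment contributes exactly one frame, the sampled indices span the whole clip while the number of frames processed stays fixed regardless of video length.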
Drone-camera-based human activity recognition (HAR) has received significant attention from the computer vision research community in the past few years. A robust and efficient HAR system has a pivotal role in fields like video surveillance, crowd behavior analysis, sports analysis, and human-computer interaction. What makes it challenging are the complex poses, the need to handle different viewpoints, and the environmental scenarios in which the action takes place. To address such complexities, in this paper, we propose a novel Sparse Weighted Temporal Fusion (SWTF) module that uses sparsely sampled video frames to obtain a global weighted temporal fusion outcome. The proposed SWTF is divided into two components: first, a temporal segment network that sparsely samples a given set of frames; second, weighted temporal fusion, which fuses feature maps derived from optical flow with raw RGB images. This is followed by a base network, which comprises a convolutional neural network module along with fully connected layers that perform the activity recognition. The SWTF network can be used as a plug-in module for existing deep CNN architectures, enabling them to learn temporal information without a separate temporal stream. It has been evaluated on three publicly available benchmark datasets, namely Okutama, MOD20, and Drone-Action. The proposed model achieved an accuracy of 72.76%, 92.56%, and 78.86% on the respective datasets, thereby surpassing the previous state-of-the-art performances by a significant margin.
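Image binarization techniques are widely used to enhance noisy and/or degraded images for different Document Image Analysis (DIA) applications such as word spotting, document retrieval, and OCR. Most existing techniques focus on feeding pixel images into convolutional neural networks to accomplish document binarization, which may not produce effective results when working with compressed images that need to be processed without full decompression. Therefore, in this research paper, we propose document image binarization directly on JPEG-compressed document images by employing a Dual Discriminator Generative Adversarial Network (DD-GAN). Here, the two discriminator networks, global and local, operate on different image ratios, and focal loss is used as the generator loss. The proposed model has been thoroughly tested on different versions of the DIBCO dataset, which poses challenges such as holes, erased or smudged ink, dust, and misplaced fibres. The model proved to be highly robust and efficient in terms of both time and space complexity, and it also achieved state-of-the-art performance in the JPEG compressed domain.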
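Physics-based models have been the mainstream approach in fluid dynamics for developing predictive models. In recent years, machine learning has offered a renaissance to the fluid community, owing to rapid developments in data science, processing units, neural-network-based technologies, and sensor adaptations. So far, in many applications in fluid dynamics, machine learning approaches have mostly followed a standard process that requires centralizing the training data on a designated machine or in a data center. In this letter, we present a federated machine learning approach that enables localized clients to collaboratively learn an aggregated and shared predictive model while keeping all the training data on each edge device. We demonstrate the feasibility and prospects of such a decentralized learning approach in an effort to forge a deep learning surrogate model for reconstructing spatiotemporal fields. Our results indicate that federated machine learning might be a viable tool for designing highly accurate, predictive, decentralized digital twins relevant to fluid dynamics.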
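The abstract does not specify the aggregation rule used to build the shared model. A widely used choice is federated averaging, in which the server combines client parameters weighted by each client's local sample count. The sketch below assumes that rule. The function name `federated_average` and the flat-list parameter representation are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average per-client parameter vectors, weighted by local sample counts."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged
```

Only the parameter vectors and sample counts reach the server in this scheme; the raw training data stay on each client, which is the property the abstract emphasizes.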
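Profile hidden Markov models (PHMMs) are widely used in many bioinformatics applications to accurately identify similarities between biological sequences (e.g., DNA or protein sequences). PHMMs use a commonly adopted and highly accurate method, the Baum-Welch algorithm, to calculate these similarities. However, the Baum-Welch algorithm is computationally expensive, and existing works provide either software-only or hardware-only solutions for a fixed PHMM design. When we analyze the state-of-the-art works, we find a pressing need for a flexible, high-performance, and energy-efficient hardware-software co-design that efficiently and effectively addresses all the major inefficiencies in the Baum-Welch algorithm for PHMMs. We propose APHMM, the first flexible acceleration framework that can significantly reduce the computational and energy overheads of the Baum-Welch algorithm for PHMMs. APHMM leverages hardware-software co-design to address the major inefficiencies in the Baum-Welch algorithm by 1) designing flexible hardware to support different PHMM designs, 2) exploiting the predictable data dependency pattern in on-chip memory with memoization techniques, 3) quickly eliminating negligible computations with a hardware-based filter, and 4) minimizing redundant computations. We implement 1) our hardware-software optimizations on specialized hardware and 2) our software optimizations for GPUs to provide the first flexible Baum-Welch accelerator for PHMMs. APHMM provides significant speedups of 15.55x-260.03x, 1.83x-5.34x, and 27.97x compared to CPU, GPU, and FPGA implementations of the Baum-Welch algorithm, respectively. APHMM outperforms state-of-the-art CPU implementations of three important bioinformatics applications, 1) error correction, 2) protein family search, and 3) multiple sequence alignment, by 1.29x-59.94x, 1.03x-1.75x, and 1.03x-1.95x, respectively.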
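For orientation, the forward recursion below is one of the building blocks of the Baum-Welch algorithm. It is a plain textbook sketch for a generic discrete HMM, not the PHMM-specific design or any of the optimizations described in the abstract. The function name `forward_likelihood` and the list-based parameter layout are assumptions for illustration.

```python
def forward_likelihood(obs, init, trans, emit):
    """Return P(obs) under a discrete HMM via the forward recursion.

    init[i]:     probability of starting in state i
    trans[i][j]: probability of moving from state i to state j
    emit[i][o]:  probability that state i emits symbol o
    """
    if not obs:
        raise ValueError("observation sequence must be non-empty")
    n = len(init)
    # alpha[i] = P(observations so far, current state = i)
    alpha = [init[i] * emit[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            emit[j][o] * sum(alpha[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return sum(alpha)
```

Each step reuses the previous `alpha` vector instead of recomputing it, and costs a number of multiplications quadratic in the number of states. For long sequences a practical implementation would also rescale `alpha` or work in log space to avoid underflow.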
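The demand for energy has increased significantly because of population growth and globalization. Accurate energy consumption forecasting has therefore become an essential prerequisite for government planning, for reducing energy waste, and for the stable operation of energy management systems. In this work, we present a comparative analysis of major machine learning models for time series forecasting of household energy consumption. Specifically, we first use WEKA, a data mining tool, to apply the models to hourly and daily household energy consumption datasets available from the Kaggle data science community. The models applied are: multilayer perceptron, K-nearest-neighbor regression, support vector regression, linear regression, and Gaussian processes. Second, we also implemented the time series forecasting models ARIMA and VAR in Python to forecast household energy consumption in South Korea with and without weather data. Our results show that the best method for forecasting energy consumption is support vector regression, followed by multilayer perceptron and Gaussian process regression.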
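Prepositions are frequently occurring polysemous words. Disambiguating prepositions is crucial in tasks such as semantic role labelling, question answering, textual entailment, and noun compound paraphrasing. In this paper, we propose a novel method for preposition sense disambiguation (PSD) that does not use any linguistic tools. In a supervised setting, the machine learning model is presented with sentences in which prepositions have been annotated with senses. These senses are IDs in what is called The Preposition Project (TPP). We use the hidden-layer representations from pre-trained BERT and BERT variants. The latent representations are then classified into the correct sense ID using a multilayer perceptron. The dataset used for this task is from SemEval-2007 Task 6. Our method achieves an accuracy of 86.85%, which is better than the state of the art.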
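Video content classification is an important research topic in computer vision and is widely used in many fields, such as image and video retrieval. This paper presents a model that combines a convolutional neural network (CNN) and a recurrent neural network (RNN); we develop, train, and optimize a deep learning network that can identify the type of video content and classify it into categories such as "animation, gaming, natural content, flat content, etc.". To enhance the model, a novel keyframe extraction method is included so that only the keyframes are classified, thereby reducing the overall processing time without sacrificing any significant performance.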